A method for style adaptation to spontaneous speech by using a semi-linear interpolation technique
نویسندگان
چکیده
This paper deals with a method for adapting a language model created fromwritten-text corpora to spontaneous speech by using a semi-linear interpolation technique. Sizes and topic coverages of spoken language corpora are usually far smaller those of written-text corpora. We propose an approach to adapt a base language model to the styles of spontaneous speech on the basis of the following assumptions. The words that are topic-independent, that is to say, common in spontaneous speech should be predicted mainly by a model created from spontaneous speech corpora (style model), while the base model is more reliable for predicting topic-related words, because they are di cult to predict from a model based on a small corpus. We classi ed all words into dis uencies and normal words. The normal words are classi ed into two more categories; common words and topic words according to mutual information. For each category, the quali ed models (base or style) with the optimal weights for linear interpolation are selected. In other words, a di erent linear combination of the models is used for each category of a predicted word. We conducted experiments by using a spoken-language corpus of Japanese for creating the style model. We achieved 159.1 in test-set perplexity compared with the baseline of 189.3 (simple linear interpolation) and the perplexity of the style speci c model, which was 230.7.
منابع مشابه
Performance evaluation of style adaptation for hidden semi-Markov model based speech synthesis
This paper describes a style adaptation technique using hidden semi-Markov model (HSMM) based maximum likelihood linear regression (MLLR). The HSMM-based MLLR technique can estimate regression matrices for affine transform of mean vectors of output and state duration distributions which maximize likelihood of adaptation data using EM algorithm. In this study, we apply this adaptation technique ...
متن کاملDomain Adaptation of Maximum Entropy Language Models
We investigate a recently proposed Bayesian adaptation method for building style-adapted maximum entropy language models for speech recognition, given a large corpus of written language data and a small corpus of speech transcripts. Experiments show that the method consistently outperforms linear interpolation which is typically used in such cases.
متن کاملAutomatic Transcription of Lecture Speech using Language Model Based on Speaking-Style Transformation of Proceeding Texts
For language modeling of spontaneous speech recognition, we propose a style transformation approach, which transforms written texts to a spoken-style language model. Since these two styles are largely different and thus direct transformation is difficult, we cascade two transformation methods; rule-based transformation to rewrite written-style texts to intermediate “verbatim” texts, and statist...
متن کاملUnsupervised Language Model Adaptation Using Word Classes for Spontaneous Speech Recognition
This paper proposes an unsupervised, batch-type, class-based language model adaptation method for spontaneous speech recognition. The word classes are automatically determined by maximizing the average mutual information between the classes using a training set. A class-based language model is built based on recognition hypotheses obtained using a general word-based language model, and linearly...
متن کاملA style control technique for speech synthesis using multiple regression HSMM
This paper presents a technique for controlling intuitively the degree or intensity of speaking styles and emotional expressions of synthetic speech. The conventional style control technique based on multiple regression HMM (MRHMM) has a problem that it is difficult to control phone duration of synthetic speech because HMM has no explicit parameter which models phone duration appropriately. To ...
متن کامل